Abstract. We explicitly construct infinite families of MSTD (more sums than differences)
sets, i.e., sets where |A + A| > |A − A|. There are enough of these sets to prove that
there exists a constant C such that at least C/r^4 of the 2^r subsets of {1, ..., r} are
MSTD sets; thus our family is significantly denser than previous constructions (whose
densities are at most f(r)/2^{r/2} for some polynomial f(r)). We conclude by generalizing
our method to compare linear forms ε_1 A + ··· + ε_n A with ε_i ∈ {−1, 1}.

1. Introduction

Given a finite set of integers A, we define its sumset A + A and difference set A − A by

A + A = {a_i + a_j : a_i, a_j ∈ A}
A − A = {a_i − a_j : a_i, a_j ∈ A}, (1.1)

and let |X| denote the cardinality of X. If |A + A| > |A − A|, then, following Nathanson,
we call A an MSTD (more sums than differences) set. As addition is commutative while
subtraction is not, we expect that for a 'generic' set A we have |A − A| > |A + A|, as a
typical pair (x, y) contributes one sum and two differences; thus we expect MSTD sets
to be rare.

Martin and O'Bryant [MO] proved that, in some sense, this intuition is wrong. They
considered the uniform model¹ for choosing a subset A of {1, ..., n}, and showed that
there is a positive probability that a random subset A is an MSTD set (though, not
surprisingly, the probability is quite small). However, the answer is very different for
other ways of choosing subsets randomly, and if we decrease slightly the probability
an element is chosen then our intuition is correct. Specifically, consider the binomial
model with parameter p(n), with lim_{n→∞} p(n) = 0 and n^{−1} = o(p(n)) (so p(n) doesn't
tend to zero so rapidly that the sets are too sparse).² Hegarty and Miller [HM] recently
proved that, in the limit as n → ∞, the percentage of subsets of {1, ..., n} that are
MSTD sets tends to zero in this model.

Though MSTD sets are rare, they do exist (and, in the uniform model, are somewhat
abundant by the work of Martin and O'Bryant). Examples go back to the 1960s.
Conway is said to have discovered {0, 2, 3, 4, 7, 11, 12, 14}, while Marica [Ma] gave
{0, 1, 2, 4, 7, 8, 12, 14, 15} in 1969 and Freiman and Pigarev [FP] found {0, 1, 2, 4, 5,
9, 12, 13, 14, 16, 17, 21, 24, 25, 26, 28, 29} in 1973. Recent work includes infinite
families constructed by Hegarty [He] and Nathanson [Na2], as well as existence proofs by
Ruzsa [Ru1, Ru2, Ru3].

² This model means that the probability k ∈ {1, ..., n} is in A is p(n).
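
The definitions in (1.1) are easy to check by machine. The following is a minimal sketch (the helper names `sumset`, `diffset` and `is_mstd` are ours, not the paper's) that computes both sets by brute force and compares their sizes for the Conway and Marica examples, with an arithmetic progression included as a balanced control.

```python
def sumset(A):
    """Return A + A = {a + b : a, b in A}."""
    return {a + b for a in A for b in A}


def diffset(A):
    """Return A - A = {a - b : a, b in A}."""
    return {a - b for a in A for b in A}


def is_mstd(A):
    """True when A has more sums than differences, i.e. |A + A| > |A - A|."""
    return len(sumset(A)) > len(diffset(A))


conway = {0, 2, 3, 4, 7, 11, 12, 14}
marica = {0, 1, 2, 4, 7, 8, 12, 14, 15}
progression = set(range(0, 30, 3))  # an arithmetic progression, used as a control

for name, A in [("Conway", conway), ("Marica", marica), ("progression", progression)]:
    print(name, len(sumset(A)), len(diffset(A)), is_mstd(A))
```

The quadratic brute force is adequate for sets of this size; for large sets one would want a bitmask or convolution based approach instead.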

Most of the previous constructions³ of infinite families of MSTD sets start with a
symmetric set which is then 'perturbed' slightly through the careful addition of a few
elements that increase the number of sums more than the number of differences; see
[He, Na2] for a description of some previous constructions and methods. In many
cases, these symmetric sets are arithmetic progressions; such sets are natural starting
points because if A is an arithmetic progression, then |A + A| = |A − A|.⁴

In this work we present a new method which takes an MSTD set satisfying certain
conditions and constructs an infinite family of MSTD sets. While these families are not
dense enough to prove a positive percentage of subsets of {1, ..., r} are MSTD sets,
we are able to elementarily show that the percentage is at least C/r^4 for some constant
C. Thus our families are far denser than those in [He, Na2]; trivial counting⁵ shows
all of their infinite families give at most f(r) 2^{r/2} of the subsets of {1, ..., r} (for some
polynomial f(r)) are MSTD sets, implying a percentage of at most f(r)/2^{r/2}.

We first introduce some notation. The first is a common convention, while the second
codifies a property which we've found facilitates the construction of MSTD sets.

• We let [a, b] denote all integers from a to b; thus [a, b] = {n ∈ Z : a ≤ n ≤ b}.

• We say a set of integers A has the property P_n (or is a P_n-set) if both its sumset
and its difference set contain all but the first and last n possible elements (and
of course it may or may not contain some of these fringe elements).⁶ Explicitly,
let a = min A and b = max A. Then A is a P_n-set if

[2a + n, 2b − n] ⊂ A + A (1.2)



³ An alternate method constructs an infinite family from a given MSTD set A by considering
A_t = {Σ_{i=1}^{t} a_i m^{i−1} : a_i ∈ A}. For m sufficiently large, these will be MSTD sets;
this is called the base expansion method. Note, however, that these will be very sparse.
See [He] for more details.

⁴ As |A + A| and |A − A| are not changed by mapping each x ∈ A to αx + β for any fixed α ≠ 0
and β, we may assume our arithmetic progression is just {0, ..., n}, and thus the cardinality
of each set is 2n + 1.

⁵ For example, consider the following construction of MSTD sets from [Na2]: let m, d, k ∈ N with
m ≥ 4, 1 ≤ d ≤ m − 1, d ≠ m/2, k ≥ 3 if d < m/2 else k ≥ 4. Set B = [0, m − 1] \ {d},
L = {m − d, 2m − d, ..., km − d}, a* = (k + 1)m − 2d and A = B ∪ L ∪ (a* − B) ∪ {m}. Then A is
an MSTD set. The width of such a set is of the order km. Thus, if we look at all triples
(m, d, k) with km ≤ r satisfying the above conditions, these generate on the order of at most
Σ_{k ≤ r} Σ_{m ≤ r/k} Σ_{d ≤ m} 1 ≪ r^2 sets, and there are of the order 2^r possible subsets of
{0, ..., r}; thus this construction generates a negligible number of MSTD sets. Though we write
f(r)/2^{r/2} to bound the percentage from other methods, a more careful analysis shows it is
significantly less; we prefer this easier bound as it is already significantly less than our
method. See for example Theorem 2 of [He] for a denser example.

⁶ It is not hard to show that for fixed 0 < α < 1 a random set drawn from [1, n] in the uniform
model is a P_{⌊αn⌋}-set with probability approaching 1 as n → ∞.

and

[−(b − a) + n, (b − a) − n] ⊂ A − A. (1.3)
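
Conditions (1.2) and (1.3) translate directly into a membership test. The sketch below is illustrative only (the function name `is_pn_set` is our own); it checks both inclusions by brute force, using as sample input the nine-element set {1, 2, 3, 5, 8, 9, 13, 15, 16}, which is Marica's set shifted by 1.

```python
def is_pn_set(A, n):
    """Check (1.2) and (1.3): A + A and A - A contain every possible element
    except possibly the first n and the last n."""
    A = set(A)
    a, b = min(A), max(A)
    sums = {x + y for x in A for y in A}
    diffs = {x - y for x in A for y in A}
    sums_ok = all(s in sums for s in range(2 * a + n, 2 * b - n + 1))
    diffs_ok = all(d in diffs for d in range(-(b - a) + n, (b - a) - n + 1))
    return sums_ok and diffs_ok


A = {1, 2, 3, 5, 8, 9, 13, 15, 16}
print(is_pn_set(A, 8))  # middle ranges [10, 24] and [-7, 7]
print(is_pn_set(A, 4))  # wider middle range [6, 28], which must contain 27
```

Smaller n is the stronger requirement, since fewer fringe elements are exempt; the second call fails because 27 is not a sum of two elements of A.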



We can now state our construction and main result. 

Theorem 1.1. Let A = L ∪ R be a P_n, MSTD set where L ⊂ [1, n], R ⊂ [n + 1, 2n],
and 1, 2n ∈ A;⁷ see Remark 1.2 for an example of such an A. Fix a k ≥ n and let m be
arbitrary. Let M be any subset of [n + k + 1, n + k + m] with the property that it does
not have a run of more than k missing elements (i.e., for all i ∈ [n + k + 1, n + m + 1]
there is a j ∈ [i, i + k − 1] such that j ∈ M). Assume further that n + k + 1 ∉ M
and set A(M; k) = L ∪ O_1 ∪ M ∪ O_2 ∪ R', where O_1 = [n + 1, n + k], O_2 =
[n + k + m + 1, n + 2k + m] (thus the O_i's are just sets of k consecutive integers), and
R' = R + 2k + m. Then

(1) A(M; k) is an MSTD set, and thus we obtain an infinite family of distinct MSTD
sets as M varies;

(2) there is a constant C > 0 such that as r → ∞ the percentage of subsets of
{1, ..., r} that are in this family (and thus are MSTD sets) is at least C/r^4.
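
To make the construction concrete, here is a small sketch that assembles A(M; k) from its five pieces and counts sums and differences. The function name `build_extension` and the particular choices k = 8, m = 10, M = {20, 23, 25} are illustrative assumptions; the base set is {1, 2, 3, 5, 8, 9, 13, 15, 16} with n = 8, and the run condition is implemented in the "i.e." form stated in the theorem.

```python
def build_extension(L, R, n, k, m, M):
    """Assemble A(M; k) = L | O1 | M | O2 | R' following Theorem 1.1."""
    L, R, M = set(L), set(R), set(M)
    if k < n:
        raise ValueError("need k >= n")
    lo, hi = n + k + 1, n + k + m
    if any(x < lo or x > hi for x in M):
        raise ValueError("M must lie in [n+k+1, n+k+m]")
    if lo in M:
        raise ValueError("n+k+1 must not belong to M")
    # Every window [i, i+k-1] with i in [n+k+1, n+m+1] must meet M.
    for i in range(lo, n + m + 2):
        if not any(j in M for j in range(i, i + k)):
            raise ValueError("M has too long a run of missing elements")
    O1 = set(range(n + 1, n + k + 1))
    O2 = set(range(n + k + m + 1, n + 2 * k + m + 1))
    R_shifted = {x + 2 * k + m for x in R}
    return L | O1 | M | O2 | R_shifted


def sumset(A):
    return {a + b for a in A for b in A}


def diffset(A):
    return {a - b for a in A for b in A}


L, R, n = {1, 2, 3, 5, 8}, {9, 13, 15, 16}, 8
A_ext = build_extension(L, R, n, k=8, m=10, M={20, 23, 25})
print(sorted(A_ext))
print(len(sumset(A_ext)), len(diffset(A_ext)))
```

For this input the extended set has width 2n + 2k + m = 42, and the sum and difference counts both grow by 2(2k + m) = 52 relative to the base set, so the surplus of one sum is preserved.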

Remark 1.2. In order to show that our theorem is not trivial, we must of course exhibit
at least one P_n, MSTD set A satisfying all our requirements (else our family is empty!).
We may take the set⁸ A = {1, 2, 3, 5, 8, 9, 13, 15, 16}; it is an MSTD set as

A + A = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
         22, 23, 24, 25, 26, 28, 29, 30, 31, 32}
A − A = {−15, −14, −13, −12, −11, −10, −8, −7, −6, −5, −4, −3, −2, −1,
         0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15} (1.4)

(so |A + A| = 30 > 29 = |A − A|). A is also a P_n-set, as (1.2) is satisfied since
[10, 24] ⊂ A + A and (1.3) is satisfied since [−7, 7] ⊂ A − A.

For the uniform model, a subset of [1, 2n] is a P_n-set with high probability as n → ∞,
and thus examples of this nature are plentiful. For example, of the 1748 MSTD sets with
minimum 1 and maximum 24, 1008 are P_n-sets.

Unlike other estimates on the percentage of MSTD sets, our arguments are not
probabilistic, and rely on explicitly constructing large families of MSTD sets. Our arguments
share some similarities with the methods in [He] (see for example Case I of Theorem
8) and [MO]. There the fringe elements of the set were also chosen first. A random
set was then added in the middle, and the authors argued that with high probability the
resulting set is an MSTD set. We can almost add a random set in the middle; the reason
we do not obtain a positive percentage is that we have the restriction that there can be
no consecutive block of size k of numbers in the middle that are not chosen to be in
A(M; k). This is easily satisfied by requiring us to choose at least one number in
consecutive blocks of size k/2, and this is what leads to the loss of a positive percentage⁹
(though we do obtain sets that are known to be MSTD sets, and not just highly likely to
be MSTD sets).

⁷ Requiring 1, 2n ∈ A is quite mild; we do this so that we know the first and last elements of A.

⁸ This A is trivially modified from [Ma] by adding 1 to each element, as we start our sets with 1
while other authors start with 0. We chose this set as our example as it has several additional
nice properties that were needed in earlier versions of our construction which required us to
assume slightly more about A.

The paper is organized as follows. We describe our construction in §2 and prove our
claimed lower bounds for the percentage of sets that are MSTD sets in §3. We then
generalize our construction in §4 and explore when there are infinite families of sets
satisfying

|ε_1 A + ··· + ε_n A| > |ε̃_1 A + ··· + ε̃_n A|, ε_i, ε̃_i ∈ {−1, 1}. (1.5)

We end with some concluding remarks and suggestions for future research in §5.

2. Construction of infinite families of MSTD sets 

Let A ⊂ [1, 2n]. We can write this set as A = L ∪ R where L ⊂ [1, n] and R ⊂
[n + 1, 2n]. We have

A + A = [L + L] ∪ [L + R] ∪ [R + R] (2.1)

where L + L ⊂ [2, 2n], L + R ⊂ [n + 2, 3n] and R + R ⊂ [2n + 2, 4n], and

A − A = [L − R] ∪ [L − L] ∪ [R − R] ∪ [R − L] (2.2)

where L − R ⊂ [−(2n − 1), −1], L − L ⊂ [−(n − 1), n − 1], R − R ⊂ [−(n − 1), n − 1]
and R − L ⊂ [1, 2n − 1].

A typical subset A of {1, ..., 2n} (chosen from the uniform model, see Footnote 1)
will be a P_n-set (see Footnote 6). It is thus the interaction of the "fringe" elements that
largely determines whether a given set is an MSTD set. Our construction begins with a
set A that is both an MSTD set and a P_n-set. We construct a family of P_n, MSTD sets
by inserting elements into the middle in such a way that the new set is a P_n-set, and the
number of added sums is equal to the number of added differences. Thus the new set is
also an MSTD set.

In creating MSTD sets, it is very useful to know that we have a P_n-set. The reason
is that we have all but the "fringe" possible sums and differences, and are thus reduced
to studying the extreme sums and differences. The following lemma shows that if A
is a P_n, MSTD set and a certain extension of A is a P_n-set, then this extension is also
an MSTD set. The difficult step in our construction is determining a large class of
extensions which lead to P_n-sets; we will do this in Lemma 2.2.

Lemma 2.1. Let A = L ∪ R be a P_n-set where L ⊂ [1, n] and R ⊂ [n + 1, 2n]. Form
A' = L ∪ M ∪ R' where M ⊂ [n + 1, n + m] and R' = R + m. If A' is a P_n-set then
|A' + A'| − |A + A| = |A' − A'| − |A − A| = 2m (i.e., the number of added sums is
equal to the number of added differences). In particular, if A is an MSTD set then so is
A'.

Proof. We first count the number of added sums. In the interval [2, n + 1] both A + A
and A' + A' are identical, as any sum can come only from terms in L + L. Similarly,
we can pair the sums of A + A in the region [3n + 1, 4n] with the sums of A' + A' in
the region [3n + 2m + 1, 4n + 2m], as these can come only from R + R and (R + m) +
(R + m) respectively. Since we have accounted for the n smallest and largest terms
in both A + A and A' + A', and as both are P_n-sets, the number of added sums is just
(3n + 2m + 1) − (3n + 1) = 2m.

Similarly, differences in the interval [1 − 2n, −n] that come from L − R can be
paired with the corresponding terms from L − (R + m), and differences in the interval
[n, 2n − 1] from R − L can be paired with differences coming from (R + m) − L.
Thus the size of the middle grows from the interval [−n + 1, n − 1] to the interval
[−n − m + 1, n + m − 1]. Thus we have added (2n + 2m + 3) − (2n + 3) = 2m
differences. Thus |A' + A'| − |A + A| = |A' − A'| − |A − A| = 2m as desired. □

⁹ Without this requirement, we could take any M and thus would have a positive percentage work,
specifically at least 2^{−(2n+2k)}.
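
Lemma 2.1 can be sanity-checked on a small instance. The sketch below is our own illustration (helper names are assumptions): it starts from {1, 2, 3, 5, 8, 9, 13, 15, 16} with n = 8, inserts the full interval M = [9, 24] (so m = 16), confirms that the extended set is still a P_n-set, and reports how many sums and differences were added.

```python
def sumset(A):
    return {a + b for a in A for b in A}


def diffset(A):
    return {a - b for a in A for b in A}


def is_pn_set(A, n):
    """Brute-force check of conditions (1.2) and (1.3)."""
    a, b = min(A), max(A)
    sums, diffs = sumset(A), diffset(A)
    return (all(s in sums for s in range(2 * a + n, 2 * b - n + 1))
            and all(d in diffs for d in range(-(b - a) + n, (b - a) - n + 1)))


A = {1, 2, 3, 5, 8, 9, 13, 15, 16}
n, m = 8, 16
L = {x for x in A if x <= n}
R = {x for x in A if x > n}
M = set(range(n + 1, n + m + 1))       # illustrative choice: take all of [n+1, n+m]
A_prime = L | M | {x + m for x in R}   # A' = L | M | (R + m)

added_sums = len(sumset(A_prime)) - len(sumset(A))
added_diffs = len(diffset(A_prime)) - len(diffset(A))
print(is_pn_set(A_prime, n), added_sums, added_diffs)
```

Both increments equal 2m = 32 here, so the one-element surplus of sums over differences carries over from A to A'.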

The above lemma is not surprising, as in it we assume A' is a P_n-set; the difficulty
in our construction is showing that our new set A(M; k) is also a P_n-set for suitably
chosen M. This requirement forces us to introduce the sets O_i (which are blocks of k
consecutive integers), as well as requiring M to have at least one of every k consecutive
integers.

We are now ready to prove the first part of Theorem 1.1 by constructing an infinite
family of distinct P_n, MSTD sets. We take a P_n, MSTD set and insert a set in such a
way that it remains a P_n-set; thus by Lemma 2.1 we see that this new set is an MSTD
set.

Lemma 2.2. Let A = L ∪ R be a P_n-set where L ⊂ [1, n], R ⊂ [n + 1, 2n], and
1, 2n ∈ A. Fix a k ≥ n and let m be arbitrary. Choose any M ⊂ [n + k + 1, n + k + m]
with the property that M does not have a run of more than k missing elements, and
form A(M; k) = L ∪ O_1 ∪ M ∪ O_2 ∪ R' where O_1 = [n + 1, n + k], O_2 = [n + k +
m + 1, n + 2k + m], and R' = R + 2k + m. Then A(M; k) is a P_n-set.

Proof. For notational convenience, denote A(M; k) by A'. Note A' + A' ⊂ [2, 4n + 4k +
2m]. We begin by showing that there are no missing sums from n + 2 to 3n + 4k + 2m;
proving an analogous statement for A' − A' shows A' is a P_n-set. By symmetry¹⁰ we
only have to show that there are no missing sums in [n + 2, 2n + 2k + m]. We consider
various ranges in turn.

We observe that [n + 2, n + k + 1] ⊂ A' + A' because we have 1 ∈ L and these
sums result from 1 + O_1. Additionally, O_1 + O_1 = [2n + 2, 2n + 2k] ⊂ A' + A'.
Since n ≤ k we have n + k + 1 ≥ 2n + 1, so these two regions are contiguous and thus
[n + 2, 2n + 2k] ⊂ A' + A'.

Now consider O_1 + M. Since M does not have a run of more than k missing elements,
the worst case scenario (in terms of getting the required sums) is that the smallest
element of M is n + 2k and that the largest element is n + m + 1 (and, of course, we still
have at least one out of every k consecutive integers is in M). If this is the case then we
still have O_1 + M ⊃ [(n + 1) + (n + 2k), (n + k) + (n + m + 1)] = [2n + 2k + 1, 2n + k + m + 1].
We had already shown that A' + A' has all sums up to 2n + 2k; this extends the sumset
to all sums up to 2n + k + m + 1.

All that remains is to show we have all sums in [2n + k + m + 2, 2n + 2k + m]. This
follows immediately from O_1 + O_2 = [2n + k + m + 2, 2n + 3k + m] ⊂ A' + A'. This
extends our sumset to include all sums up to 2n + 3k + m, which is well past our halfway
mark of 2n + 2k + m. Thus we have shown that A' + A' ⊃ [n + 2, 3n + 4k + 2m].

We now do a similar calculation for the difference set, which is contained in
[−(2n + 2k + m) + 1, (2n + 2k + m) − 1]. As we have already analyzed the sumset, all that
remains to prove A' is a P_n-set is to show that A' − A' ⊃ [−n − 2k − m + 1, n + 2k + m − 1].
As all difference sets¹¹ are symmetric about and contain 0, it suffices to show the positive
elements are present, i.e., that A' − A' ⊃ [1, n + 2k + m − 1].

We easily see [1, k − 1] ⊂ A' − A' as [0, k − 1] ⊂ O_1 − O_1. Now consider M − O_1.
Again the worst case scenario (for getting the required differences) is that the least
element of M is n + 2k and the greatest is n + m + 1. With this in mind we see that
M − O_1 ⊃ [(n + 2k) − (n + k), (n + m + 1) − (n + 1)] = [k, m]. Now O_2 − O_1 ⊃
[(n + k + m + 1) − (n + k), (n + 2k + m) − (n + 1)] = [m + 1, 2k + m − 1], and we
therefore have all differences up to 2k + m − 1.

Since 2n ∈ A we have 2n + 2k + m ∈ A'. Consider (2n + 2k + m) − O_1 =
[n + k + m, n + 2k + m − 1]. Since k ≥ n we see that n + k + m ≤ 2k + m; this
implies that we have all differences up to n + 2k + m − 1 (this is because we already
have all differences up to 2k + m − 1, and n + k + m is either less than 2k + m − 1, or
at most one larger). □

¹⁰ Apply the arguments below to the set 2n + 2k + m − A', noting that 1, 2n + 2k + m ∈ A'.
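
For small parameters the conclusion of Lemma 2.2 can be checked exhaustively. The following sketch is only an illustration under our own choices (base set {1, 2, 3, 5, 8, 9, 13, 15, 16}, n = k = 8, m = 10, and helper names that do not come from the paper): it runs through every admissible M, builds A(M; k), and verifies that each resulting set is a P_n-set and an MSTD set.

```python
def sumset(A):
    return {a + b for a in A for b in A}


def diffset(A):
    return {a - b for a in A for b in A}


def is_pn_set(A, n):
    a, b = min(A), max(A)
    sums, diffs = sumset(A), diffset(A)
    return (all(s in sums for s in range(2 * a + n, 2 * b - n + 1))
            and all(d in diffs for d in range(-(b - a) + n, (b - a) - n + 1)))


def admissible(M, n, k, m):
    """Every window [i, i+k-1], i in [n+k+1, n+m+1], contains an element of M."""
    return all(any(j in M for j in range(i, i + k))
               for i in range(n + k + 1, n + m + 2))


def build(L, R, n, k, m, M):
    O1 = set(range(n + 1, n + k + 1))
    O2 = set(range(n + k + m + 1, n + 2 * k + m + 1))
    return L | O1 | M | O2 | {x + 2 * k + m for x in R}


L, R, n = {1, 2, 3, 5, 8}, {9, 13, 15, 16}, 8
k, m = 8, 10
# n+k+1 is excluded from M by assumption, so M ranges over subsets of [n+k+2, n+k+m].
candidates = list(range(n + k + 2, n + k + m + 1))

checked, all_ok = 0, True
for mask in range(1 << len(candidates)):
    M = {x for i, x in enumerate(candidates) if (mask >> i) & 1}
    if not admissible(M, n, k, m):
        continue
    A_ext = build(L, R, n, k, m, M)
    checked += 1
    all_ok = all_ok and is_pn_set(A_ext, n) and len(sumset(A_ext)) > len(diffset(A_ext))

print(checked, all_ok)
```

An exhaustive loop like this is feasible only for small m, since the number of candidate sets M doubles with each additional position.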

Proof of Theorem 1.1(1). The proof of the first part of Theorem 1.1 follows
immediately. By Lemma 2.2 our new sets A(M; k) are P_n-sets, and by Lemma 2.1
they are also MSTD. All that remains is to show that the sets are distinct; this is done by
requiring n + k + 1 is not in our set (for a fixed k, these sets have elements n + 1, ..., n + k
but not n + k + 1; thus different k yield distinct sets). □



3. Lower bounds for the percentage of MSTDs 

To finish the proof of Theorem 1.1, for a fixed n we need to count how many sets
M̃ of the form O_1 ∪ M ∪ O_2 (see Theorem 1.1 for a description of these sets) of width
r = 2k + m can be inserted into a P_n, MSTD set A of width 2n. As O_1 and O_2 are
just intervals of k consecutive ones, the flexibility in choosing them comes solely from
the freedom to choose their length k (so long as k ≥ n). There is far more freedom to
choose M.

There are two issues we must address. First, we must determine how many ways there
are to fill the elements of M such that there are no runs of k missing elements.
Second, we must show that the sets generated by this method are distinct. We saw in
the proof of Theorem 1.1(1) that the latter is easily handled by giving A(M; k) (through
our choice of M) slightly more structure. Assume that the element n + k + 1 is not in
M (and thus not in A(M; k)). Then for a fixed width r = 2k + m each value of k gives rise
to necessarily distinct sets, since the set contains [n + 1, n + k] but not n + k + 1. In
our arguments below, we assume our initial P_n, MSTD set A is fixed; we could easily
increase the number of generated MSTD sets by varying A over certain MSTD sets of
size 2n. We choose not to do this as n is fixed, and thus varying over such A will only
change the percentages by a constant independent of k and m.



¹¹ Unless, of course, A is the empty set!

Fix n and let r tend to infinity. We count how many M̃'s there are of width r such
that in M there is at least one element chosen in any consecutive block of k integers.
One way to ensure this is to divide M into consecutive, non-overlapping blocks of size
k/2, and choose at least one element in each block. There are 2^{k/2} subsets of a block
of size k/2, and all but one have at least one element. Thus there are 2^{k/2} − 1 =
2^{k/2}(1 − 2^{−k/2}) valid choices for each block of size k/2. As the width of M is r − 2k,
there are ⌈(r − 2k)/(k/2)⌉ ≤ r/(k/2) − 3 blocks (the last block may have length less than
k/2, in which case any configuration will suffice to ensure there is not a consecutive string
of k omitted elements in M because there will be at least one element chosen in the
previous block).
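
The block argument gives a lower bound, not an exact count. As a rough illustration (the function names and the small parameters are our own, and k is assumed even), the sketch below counts exactly, by dynamic programming, the subsets of m consecutive integers that contain no run of k consecutive missing elements, and compares the result with the bound obtained by forcing one element into each block of size k/2.

```python
def count_no_long_gap(m, k):
    """Number of subsets of m consecutive integers with no run of k
    consecutive missing elements (dynamic programming over positions)."""
    # runs[t] = number of ways so far whose current run of missing elements has length t
    runs = [0] * k
    runs[0] = 1
    for _ in range(m):
        new = [0] * k
        new[0] = sum(runs)            # include this integer: the run resets to 0
        for t in range(k - 1):
            new[t + 1] += runs[t]     # omit this integer: the run grows but stays below k
        runs = new
    return sum(runs)


def block_lower_bound(m, k):
    """Choices with at least one element in each full block of size k/2;
    a shorter final block is left unconstrained."""
    half = k // 2
    full_blocks, rest = divmod(m, half)
    return (2 ** half - 1) ** full_blocks * 2 ** rest


for m, k in [(8, 4), (10, 4), (12, 6)]:
    print(m, k, count_no_long_gap(m, k), block_lower_bound(m, k))
```

The gap between the two columns shows how much is given up by the block restriction, which is the source of the lost positive percentage.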

We see that the number of valid M's of width r − 2k is at least
2^{r−2k} (1 − 2^{−k/2})^{r/(k/2) − 3}. As O_1 and O_2 are two sets of k consecutive 1's,
there is only one way to choose either. We therefore see that, for a fixed k, of the
2^r = 2^{m+2k} possible subsets of r consecutive integers, at least
2^{r−2k} (1 − 2^{−k/2})^{r/(k/2) − 3} are permissible to insert into A.
To ensure that all of the sets are distinct, we require n + k + 1 ∉ M; the effect of this is to
eliminate one degree of freedom in choosing an element in the first block of M, and this
will only change the proportionality constants in the percentage calculation (and not the
r or k dependencies). Thus if we vary k from n to r/4 (we could go a little higher, but
once k is as large as a constant times r the number of generated sets of width r is
negligible) we have at least some fixed constant times
2^r Σ_{k=n}^{r/4} 2^{−2k} (1 − 2^{−k/2})^{r/(k/2) − 3} MSTD sets; equivalently, the
percentage of sets O_1 ∪ M ∪ O_2 with O_1 of width k ∈ {n, ..., r/4} and M of width
r − 2k that we may add is at least this divided by 2^r, or some universal constant times

Σ_{k=n}^{r/4} (1/2^{2k}) (1 − 1/2^{k/2})^{r/(k/2)} (3.1)

(as k ≥ n and n is fixed, we may remove the −3 in the exponent by changing the
universal constant).

We now determine the asymptotic behavior of this sum. More generally, we can
consider sums of the form

S(a, b, c; r) = Σ_{k=n}^{r/4} (1/2^{ak}) (1 − 1/2^{bk})^{r/(ck)}. (3.2)

For our purposes we take a = 2 and b = c = 1/2; we consider this more general sum
so that any improvements in our method can readily be translated into improvements in
counting MSTD sets. While we know (from the work of Martin and O'Bryant [MO])
that a positive percentage of such subsets are MSTD sets, our analysis of this sum yields
slightly weaker results. The approach in [MO] is probabilistic, obtained by fixing the
fringes of our subsets to ensure certain sums and differences are in (or not in) the sum-
and difference sets. While our approach also fixes the fringes, we have far more possible
fringe choices than in [MO] (though we do not exploit this). While we cannot prove a
positive percentage of subsets are MSTD sets, our arguments are far more elementary.

The proof of Theorem 1.1(2) is clearly reduced to proving the following lemma, and
then setting a = 2 and b = c = 1/2.

Lemma 3.1. Let

S(a, b, c; r) = Σ_{k=n}^{r/4} (1/2^{ak}) (1 − 1/2^{bk})^{r/(ck)}. (3.3)

Then for any ε > 0 we have

1/r^{a/b} ≪ S(a, b, c; r) ≪ (log r)^{2a+ε}/r^{a/b}. (3.4)

Proof. We constantly use that (1 − 1/x)^x is an increasing function in x. We first prove the
lower bound. For k ≥ (log_2 r)/b and r large, we have

(1 − 1/2^{bk})^{r/(ck)} ≥ (1 − 1/r)^{r · b/(c log_2 r)} ≥ 1/2 (3.5)

(in fact, for r large the last bound is almost exactly 1). Thus we trivially have

S(a, b, c; r) ≫ Σ_{k=(log_2 r)/b}^{r/4} 1/2^{ak} ≫ 1/r^{a/b}. (3.6)

For the upper bound, we divide the k-sum into two ranges: (1) bn ≤ bk ≤ log_2 r −
log_2 (log r)^δ; (2) log_2 r − log_2 (log r)^δ ≤ bk ≤ br/4. In the first range, we have

(1 − 1/2^{bk})^{r/(ck)} ≤ (1 − (log r)^δ/r)^{r · b/(c log_2 r)}
                       ≪ exp(−b (log r)^δ/(c log_2 r))
                       ≤ exp(−(b log 2/c) (log r)^{δ−1}). (3.7)

If δ > 2 then this factor is dominated by r^{−(b log 2/c)(log r)^{δ−2}} ≪ r^{−A} for any A for r
sufficiently large. Thus there is negligible contribution from k in range (1) if we take
δ = 2 + ε/a for any ε > 0.

For k in the second range, we trivially bound the factors (1 − 1/2^{bk})^{r/(ck)} by 1. We
are left with

Σ_{k ≥ (log_2 r − log_2 (log r)^δ)/b} 1/2^{ak} ≪ ((log r)^{aδ}/r^{a/b}) Σ_{ℓ=0}^{∞} 1/2^{aℓ} ≪ (log r)^{aδ}/r^{a/b}.

Combining the bounds for the two ranges with δ = 2 + ε/a completes the proof. □
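
The sum in Lemma 3.1 is S(a, b, c; r) = Σ_{k=n}^{r/4} 2^{−ak} (1 − 2^{−bk})^{r/(ck)}, and its size is easy to explore numerically. The following sketch evaluates it in floating point for the case of interest a = 2, b = c = 1/2 and prints the ratio to r^{−a/b} = r^{−4}; the starting index n = 8 and the sample values of r are illustrative assumptions, and nothing here replaces the proof.

```python
def S(a, b, c, r, n):
    """Evaluate sum_{k=n}^{r/4} 2^(-a k) * (1 - 2^(-b k))^(r / (c k)) in floating point."""
    total = 0.0
    for k in range(n, r // 4 + 1):
        total += 2.0 ** (-a * k) * (1.0 - 2.0 ** (-b * k)) ** (r / (c * k))
    return total


a, b, c, n = 2, 0.5, 0.5, 8
for r in (256, 512, 1024):
    value = S(a, b, c, r, n)
    # Ratio to the lower-bound scale r^(-a/b) = r^(-4) from (3.4).
    print(r, value, value * r ** 4)
```

Keeping r at most about 1000 avoids floating-point underflow in the factors 2^{−ak}; for larger r one would work with logarithms of the summands.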

Remark 3.2. The upper and lower bounds in Lemma 3.1 are quite close, differing by
a few powers of log r. The true value will be at least of order ((log r)/r)^{a/b}; we sketch
the proof in Appendix A.

Remark 3.3. We could attempt to increase our lower bound for the percentage of
subsets that are MSTD sets by summing r from R_0 to R (as we have fixed r above, we are
only counting MSTD sets of width 2n + r where 1 and 2n + r are in the set). Unfortunately,
at best we can change the universal constant; our bound will still be of the
order 1/R^4. To see this, note the number of such MSTD sets is at least a constant times
Σ_{r=R_0}^{R} 2^r/r^4 (to get the percentage, we divide this by 2^R). If r ≤ R/2 then there are
exponentially few sets. If r ≥ R/2 then r^{−4} ∈ [1/R^4, 16/R^4]. Thus the percentage of
such subsets is still only at least of order 1/R^4.

4. Generalizing our construction 

Instead of searching for A such that |A + A| > |A − A|, we now consider the more
general problem¹² of when

|ε_1 A + ··· + ε_n A| > |ε̃_1 A + ··· + ε̃_n A|, ε_i, ε̃_i ∈ {−1, 1}. (4.1)

Consider the generalized sumset

f_{j_1, j_2}(A) = A + A + ··· + A − A − A − ··· − A, (4.2)

where there are j_1 pluses¹³ and j_2 minuses, and set j = j_1 + j_2. Our notion of a P_n-set
generalizes, and we find that if there exists one set A with |f_{j_1, j_2}(A)| > |f_{j_1', j_2'}(A)|,
then we can construct infinitely many such A. Note without loss of generality that we
may assume j_1 ≥ j_2.¹⁴

Definition 4.1 (P^j_n-set). Let A ⊂ [1, k] with 1, k ∈ A. We say A is a P^j_n-set if any
f_{j_1, j_2}(A) with j_1 + j_2 = j contains all but the first n and last n possible elements.

Remark 4.2. Note that a P^2_n-set is the same as what we called a P_n-set earlier.
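
The generalized sumset and the P^j_n condition can be computed by brute force for small sets. The sketch below is a hedged illustration (the names `generalized_form` and `is_pjn_set` are ours); it assumes, as in Definition 4.1, that A ⊂ [1, k] with 1, k ∈ A, so that the possible elements of f_{j_1, j_2}(A) fill the interval [j_1 − k j_2, k j_1 − j_2].

```python
def generalized_form(A, j1, j2):
    """All values a_1 + ... + a_{j1} - b_1 - ... - b_{j2} with every a_i, b_i in A."""
    result = {0}
    for _ in range(j1):
        result = {x + a for x in result for a in A}
    for _ in range(j2):
        result = {x - a for x in result for a in A}
    return result


def is_pjn_set(A, n, j):
    """Check that every form with j1 + j2 = j misses at most fringe elements."""
    A = set(A)
    if min(A) != 1:
        raise ValueError("Definition 4.1 assumes 1 is the smallest element of A")
    k = max(A)
    for j1 in range(j + 1):
        j2 = j - j1
        values = generalized_form(A, j1, j2)
        lo, hi = j1 - k * j2 + n, k * j1 - j2 - n
        if not all(x in values for x in range(lo, hi + 1)):
            return False
    return True


A = {1, 10}
print(is_pjn_set(A, 9, 2), is_pjn_set(A, 9, 3))
```

The two-element set {1, 10} with n = 9 passes the test for j = 2 but not for j = 3, which shows that the two properties genuinely differ when n is large relative to the width of A.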

We expect the following generalization of Theorem 1.1 to hold.

Conjecture 4.3. For any f_{j_1, j_2} and f_{j_1', j_2'}, if there exists a finite set of integers A which
is (1) a P^j_n-set; (2) A ⊂ [1, 2n] and 1, 2n ∈ A; and (3) |f_{j_1, j_2}(A)| > |f_{j_1', j_2'}(A)|, then
there exists an infinite family of such sets.

The difficulty in proving the above conjecture is that we need to find a set A satisfying
|f_{j_1, j_2}(A)| > |f_{j_1', j_2'}(A)|; once we find such a set, we can mirror the construction from
Theorem 1.1. Currently we can only find such A for j ∈ {2, 3}:

Theorem 4.4. Conjecture 4.3 is true for j ∈ {2, 3}.

As the proof is similar to that of Theorem 1.1, we just highlight the changes. We
prove the lemmas below in greater generality than we need for our theorem as this
generality is needed to attack Conjecture 4.3. The first step is an analogue of Lemma
2.1, the second is proving that a P^2_n-set is also a P^j_n-set, and the third is constructing
sets A (when j = 3) to start the construction.

Lemma 4.5. Let A = L ∪ R be a P^j_n-set, where L ⊂ [1, n], R ⊂ [n + 1, 2n]. Form
A' = L ∪ M ∪ R', where M ⊂ [n + 1, n + m] and R' = R + m. If A' is a P^j_n-set, then
|f_{j_1, j_2}(A')| − |f_{j_1, j_2}(A)| = |f_{j_1', j_2'}(A')| − |f_{j_1', j_2'}(A)|. Thus if
|f_{j_1, j_2}(A)| > |f_{j_1', j_2'}(A)|, the same is true for A'.

Proof. Write f = f_{j_1, j_2}. Since A ⊂ [1, 2n] and is a P^j_n-set, we know
f(A) ⊂ [j_1 − 2n j_2, 2n j_1 − j_2] and [j_1 − 2n j_2 + n, 2n j_1 − j_2 − n] ⊂ f(A). Note any
elements in f(A) ∩ [j_1 − 2n j_2, j_1 − 2n j_2 + n − 1] can only come from
L + L + ··· + L − R − R − ··· − R.

As A' ⊂ [1, 2n + m], f(A') ⊂ [j_1 − (2n + m) j_2, (2n + m) j_1 − j_2] and
[j_1 − (2n + m) j_2 + n, (2n + m) j_1 − j_2 − n] ⊂ f(A'). Any elements in
f(A') ∩ [j_1 − (2n + m) j_2, j_1 − (2n + m) j_2 + n − 1] can only come from
L + L + ··· + L − R' − R' − ··· − R', which is simply a translation of
L + L + ··· + L − R − R − ··· − R.

A similar argument works for the right fringe of f_{j_1, j_2}(A'). Thus |f(A')| = |f(A)| +
jm (this is because the potential width of f_{j_1, j_2}(A') is jm more than that of f_{j_1, j_2}(A),
and the two fringes of these sets are in a 1-1 correspondence). Since |f_{j_1, j_2}(A')| −
|f_{j_1, j_2}(A)| depends only on j = j_1 + j_2, it holds for any pair of forms with j coefficients,
and the lemma is proven. □

¹² We do not consider the most general problem of comparing arbitrary combinations of A,
contenting ourselves to this special case; see [HM] for some thoughts about such generalizations.

¹³ By a slight abuse of notation, we say there are two sums in A + A − A, as is clear when we
write it as ε_1 A + ε_2 A + ε_3 A.

¹⁴ This follows as we are only interested in |f_{j_1, j_2}(A)|, which equals |f_{j_2, j_1}(A)|. This is
because B and −B have the same cardinality, and thus (for example) we see A + A − A and
−(A − A − A) have the same cardinality.

Lemma 4.6. For j ≥ 3, any P^2_n-set is also a P^j_n-set.

Proof. Let A be a P^2_n-set, where A ⊂ [1, k] and 1, k ∈ A. Assume k ≥ 2n. Then
(A + A) ∩ [n + 2, 2k − n] = [n + 2, 2k − n] (as A is a P^2_n-set).

Let f_{j_1, j_2} be a form with j ≥ 3, and thus either j_1 or j_2 is at least 2; without loss of
generality we assume j_1 ≥ 2. There is a form f_{j_1−2, j_2} such that f_{j_1−2, j_2}(A) + A + A =
f_{j_1, j_2}(A). The proof follows by showing f_{j_1−2, j_2}({1, k}) + A + A contains all necessary
elements, namely [j_1 − k j_2 + n, j_1 k − j_2 − n]. (By f_{j_1−2, j_2}({1, k}) we mean all numbers
of the form ε_1 a_1 + ··· + ε_{j−2} a_{j−2}, with the ε_i the coefficients of the form f_{j_1−2, j_2} and
a_i ∈ {1, k}.) We have

f_{j_1−2, j_2}({1, k}) ⊃ {j_1 − 2 − i + k(i − j_2) | 0 ≤ i ≤ j − 2}. (4.3)

To see this, we first consider i ≤ j_1 − 2. For such i, for the positive summands choose
1 a total of j_1 − 2 − i times and k a total of i times, while for the negative summands
we choose k each of the j_2 times. If now j_1 − 2 < i ≤ j − 2, for the positive summands
we choose k a total of i − j_2 times (which is permissible as this is at most j_1 − 2) and
we choose 1 the remaining j_1 − 2 − (i − j_2) times, while for the negative summands we
choose 1 all j_2 times. This leads to a sum of k · (i − j_2) + 1 · (j_1 − 2 + j_2 − i) − 1 · j_2,
which equals j_1 − 2 − i + k(i − j_2) as claimed. Unfortunately, this argument fails if
i = j_1 − 1 and j_1 = j_2, as we would then be choosing k from the positive summands
negative one times.¹⁵ We are thus left with showing that we may obtain the sum −1 − k
in this special case. As j_1 = j_2, we just choose 1 for the j_1 − 2 positive summands and
1 for all but one of the j_2 negative summands (where we choose the remaining one to be k).

As A is a P^2_n-set, A + A ⊃ [n + 2, 2k − n]. Thus

f_{j_1−2, j_2}({1, k}) + A + A ⊃ ∪_{i=0}^{j−2} [L_i, U_i], (4.4)

where

L_i = j_1 − 2 − i + k(i − j_2) + n + 2
U_i = j_1 − 2 − i + k(i − j_2) + 2k − n. (4.5)

We see that L_0 = j_1 − k j_2 + n and U_{j−2} = j_1 k − j_2 − n, our two desired endpoints.
The proof is completed by showing that the intervals [L_i, U_i] cover the desired interval,
i.e., that each has no gap with its neighbors.

Since 2n ≤ k, we have:

L_i − 1 = j_1 − i + k(i − j_2) + n − 1
        = (j_1 − i + ki − j_2 k − 1) + n
        ≤ (j_1 − i + ki − j_2 k − 1) + k − n
        = j_1 − 2 − (i − 1) + k((i − 1) − j_2) + 2k − n
        = U_{i−1}. (4.6)

Thus there are no gaps between the intervals [L_i, U_i], and they therefore
cover the necessary range. □

¹⁵ This is the only bad case we need consider, as we know j_1 ≥ j_2, and the only problem arises
when i − j_2 < 0.

Remark 4.7. Note that the above lemma is false if the size of n is unrestricted. To take
an extreme example, let A = {1, 10} and n = 9. Then A is a P^2_n-set (11 ∈ A + A, 0 ∈
A − A) but A is not a P^3_n-set.

Proof of Theorem 4.4. Lemmas 4.5 and 4.6 imply that the sets described in Lemma 2.2
also work in our generalized case. The counting argument of §3 requires no
modification. Thus the theorem is proved provided we can find an A to start the process.

The following set was obtained by taking elements in {2, ..., 49} to be in A with
probability¹⁶ 1/3 (and, of course, requiring 1, 50 ∈ A); it took about 300000 sets to find
the first one satisfying our conditions:

A = {1, 2, 5, 6, 16, 19, 22, 26, 32, 34, 35, 39, 43, 48, 49, 50}. (4.7)

To be a P^3_25-set we need to have A + A + A ⊃ [n + 3, 6n − n] = [28, 125] and A + A − A ⊃
[−n + 2, 3n − 1] = [−23, 74]. A simple calculation shows A + A + A = [3, 150], all
possible elements, while A + A − A = [−48, 99] \ {−34} (i.e., every possible element
but −34). Thus A is a P^3_25-set satisfying |A + A + A| > |A + A − A|, and thus we have
the example we need to prove Theorem 4.4. □

Remark 4.8. We could also have taken

A = {1, 2, 3, 4, 8, 12, 18, 22, 23, 25, 26, 29, 30, 31, 32, 34, 45, 46, 49, 50}, (4.8)

which has the same A + A + A and A + A − A.



5. Concluding remarks and future research 

One avenue of future research is to complete the proof of Conjecture 4.3 and give an
elementary example of an infinite family of sets satisfying |f_{j_1, j_2}(A)| > |f_{j_1', j_2'}(A)|. We
have reason to believe the correct model is to look for P^j_n-sets by choosing the numbers
{2, ..., 2n − 1} to be in A with probability 1/j (and, of course, requiring 1, 2n ∈ A).
Unfortunately the density of such sets appears to decrease rapidly with n, and to date
straightforward computer searches have been unsuccessful when j = 4. As we shall see
below, perhaps a better algorithm would incorporate choosing elements near the fringes
(i.e., near 1 and 2n) with a different probability than 1/j.

¹⁶ Note the probability is 1/3 and not 1/2.

Figure 1. Estimation of γ(k, 100) as k varies from 1 to 100 from a
random sample of 4458 MSTD sets.

We also observed earlier (Footnote 6) that for a constant 0 < α < 1, a set randomly
chosen from [1, 2n] is a P_{⌊αn⌋}-set with probability approaching 1 as n → ∞. MSTD
sets are of course not random, but it seems logical to suppose that this pattern continues.

Conjecture 5.1. Fix a constant 0 < α < 1/2. Then as n → ∞ the probability that a
randomly chosen MSTD set in [1, 2n] containing 1 and 2n is a P_{⌊αn⌋}-set goes to 1.

In our construction and that of [MO], a collection of MSTD sets is formed by fixing
the fringe elements and letting the middle vary. The intuition behind both is that the
fringe elements matter most and the middle elements least. Motivated by this it is
interesting to look at all MSTD sets in [1, n] and ask with what frequency a given element is
in these sets. That is, what is

γ(k, n) = #{A ⊂ [1, n] : k ∈ A and A is an MSTD set} / #{A ⊂ [1, n] : A is an MSTD set} (5.1)

as n → ∞? We can get a sense of what these probabilities might be from Figure 1.
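
For small n the frequency γ(k, n), the proportion of MSTD subsets of [1, n] that contain k, can be computed exactly by enumerating all subsets instead of sampling. The sketch below is our own illustration and is not how Figure 1 was produced: it encodes each subset as a bitmask, obtains the sizes of the sumset and difference set from shifted copies of the mask, and tallies how often each k appears among the MSTD subsets of [1, 15].

```python
def mstd_membership_counts(n):
    """Enumerate all non-empty subsets of {1, ..., n}.
    Returns (number of MSTD subsets, counts) where counts[k] is the number
    of MSTD subsets containing k (index 0 is unused)."""
    total = 0
    counts = [0] * (n + 1)
    for mask in range(1, 1 << n):
        elems = [i for i in range(n) if (mask >> i) & 1]  # bit i encodes the element i + 1
        sums = 0
        diffs = 0
        for i in elems:
            sums |= mask << i           # bits at positions a + b
            diffs |= (mask << n) >> i   # bits at positions b - a + n
        if bin(sums).count("1") > bin(diffs).count("1"):
            total += 1
            for i in elems:
                counts[i + 1] += 1
    return total, counts


n = 15
total, counts = mstd_membership_counts(n)
gamma = [counts[k] / total for k in range(1, n + 1)]
print(total)
print([round(g, 3) for g in gamma])
```

Exhaustive enumeration costs 2^n steps, so it is a check for small n only; for n near 100 one has to sample, as in Figure 1.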

Note that, as the graph suggests, γ is symmetric about (n + 1)/2, i.e. γ(k, n) = γ(n + 1 −
k, n). This follows from the fact that the cardinalities of the sumset and difference set
are unaffected by sending x → αx + β for any α ≠ 0 and β. Thus for each MSTD set A we get
a distinct MSTD set n + 1 − A showing that our function γ is symmetric. These sets
are distinct since if A = n + 1 − A then A is sum-difference balanced.¹⁷

From [MO] we know that a positive percentage of sets are MSTD sets. By the central
limit theorem we then get that the average size of an MSTD set chosen from [1, n] is
about n/2. This tells us that on average γ(k, n) is about 1/2. The graph above suggests
that the frequency goes to 1/2 in the center. This leads us to the following conjecture:

Conjecture 5.2. Fix a constant 0 < α < 1/2. Then lim_{n→∞} γ(k, n) = 1/2 for ⌊αn⌋ ≤
k ≤ n − ⌊αn⌋.

Remark 5.3. More generally, we could ask which non-decreasing functions f(n) have
f(n) → ∞, n − f(n) → ∞ and lim_{n→∞} γ(k, n) = 1/2 for all k such that ⌊f(n)⌋ ≤
k ≤ n − ⌊f(n)⌋.

Appendix A. Size of S(a, b, c; r)

We sketch the proof that the sum

S(a, b, c; r) = Σ_{k=n}^{r/4} (1/2^{ak}) (1 − 1/2^{bk})^{r/(ck)} (A.1)

is at least of order ((log r)/r)^{a/b}. We determine the maximum value of the summands

f(a, b, c; k, r) = (1/2^{ak}) (1 − 1/2^{bk})^{r/(ck)}. (A.2)

Clearly f(a, b, c; k, r) is very small if k is small due to the second factor; similarly it is
small if k is large because of the first factor. Thus the maximum value of f(a, b, c; k, r)
will arise not from an endpoint but from a critical point.

It is convenient to change variables to simplify the differentiation. Let u = 2^k (so
k = log u / log 2). Then

g(a, b, c; u, r) = f(a, b, c; k, r) = u^{−a} (1 − 1/u^b)^{r log 2/(c log u)}. (A.3)

Thus

g(a, b, c; u, r) ≈ u^{−a} exp(−r log 2/(c u^b log u)). (A.4)

Maximizing this is the same as minimizing h(a, b, c; u, r) = 1/g(a, b, c; u, r). After
some algebra we find

h'(a, b, c; u, r) = [h(a, b, c; u, r)/(c u^{b+1} log^2 u)] (a c u^b log^2 u − r log 2 · (b log u + 1)). (A.5)

Setting the derivative equal to zero yields

a c u^b log^2 u = r log 2 · (b log u + 1). (A.6)

As we know u must be large, looking at just the main term from the right hand side
yields

a c u^b log u ≈ r b log 2, (A.7)

or

u^b log u ≈ C r, C = (b log 2)/(a c). (A.8)

To first order, we see the solution is

u = (b C r/log(C r))^{1/b}. (A.9)

Straightforward algebra shows that the maximum value of our summands is
approximately (b C e)^{−a/b} ((log r)/r)^{a/b}.

¹⁷ The following proof is standard (see, for instance, [Na2]). If A = n + 1 − A then

|A + A| = |A + (n + 1 − A)| = |n + 1 + (A − A)| = |A − A|. (5.2)
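
The location of the largest summand can be compared numerically with the critical-point equation a c u^b log^2 u = r log 2 · (b log u + 1), where u = 2^k. The sketch below is only a rough check under our own choices (a = 2, b = c = 1/2, r = 1024, n = 8, a plain bisection solver): it finds the integer k that maximizes the summand 2^{−ak} (1 − 2^{−bk})^{r/(ck)} and the real k solving the critical-point equation.

```python
import math


def summand(a, b, c, k, r):
    """The k-th term 2^(-a k) * (1 - 2^(-b k))^(r / (c k)) of S(a, b, c; r)."""
    return 2.0 ** (-a * k) * (1.0 - 2.0 ** (-b * k)) ** (r / (c * k))


def critical_k(a, b, c, r):
    """Solve a*c*u^b*log(u)^2 = r*log(2)*(b*log(u) + 1), with u = 2^k, by bisection in k."""
    def g(k):
        log_u = k * math.log(2)
        return a * c * 2.0 ** (b * k) * log_u ** 2 - r * math.log(2) * (b * log_u + 1)

    lo, hi = 1.0, r / 4.0  # g(lo) < 0 < g(hi) for the parameters used below
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a, b, c, r, n = 2, 0.5, 0.5, 1024, 8
k_best = max(range(n, r // 4 + 1), key=lambda k: summand(a, b, c, k, r))
k_crit = critical_k(a, b, c, r)
print(k_best, round(k_crit, 3))
```

The critical-point equation comes from an approximation of the summand, so agreement to within one unit of k is all that should be expected at this size of r.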